ABSTRACT

The ~ 800 optically unseen (R > 25.5) 24 μm-selected sources in the
complete Spitzer First Look Survey sample (Fadda et al. 2006) with
$F_{24\mu m} > 0.35$ mJy are found to be very strongly clustered. If, as
indicated by several lines of circumstantial evidence, they are
ultraluminous far-IR galaxies at z ~ 1.6-2.7, the amplitude of their
spatial correlation function is very high. The associated comoving
clustering length is estimated to be $r_0 = 14.0^{+2.1}_{-2.4}$ Mpc, a
value which puts these sources amongst the most strongly clustered
populations of our known universe. Their 8 μm-24 μm colours suggest that
the AGN contribution dominates above $F_{24\mu m} \simeq 0.8$ mJy,
consistent with earlier analyses. The properties of these objects (number
counts, redshift distribution, clustering amplitude) are fully consistent
with those of proto-spheroidal galaxies in the process of forming most of
their stars and of growing their active nucleus, as described by the
Granato et al. (2004) model. In particular, the inferred space density of
such galaxies at z ~ 2 is much higher than expected from most
semi-analytic models.

Matches of the observed projected correlation function $w(\theta)$ with
models derived within the so-called Halo Occupation Scenario show that
these sources have to be hosted by haloes more massive than
~ $10^{13.4} M_\odot$. This value is significantly higher than that for
the typical galactic haloes hosting massive elliptical galaxies,
suggesting a duration of the starburst phase of massive high-redshift
dusty galaxies of $T_B \sim 0.5$ Gyr.
tions - cosmology: theory - large-scale structure of the Universe

1 INTRODUCTION

Understanding the assembly history of massive spheroidal galaxies is a
key issue for galaxy formation models. The "naive" expectation from the
canonical hierarchical merging scenario, which has proved remarkably
successful in explaining many aspects of large-scale structure formation,
is that massive galaxies generally form late and over a long period of
time as the result of many mergers of smaller haloes. On the other hand,
there is quite extensive evidence that massive galaxies may form at high
redshifts and on short timescales (see, e.g., Cimatti et al. 2004;
Fontana et al. 2004; Glazebrook et al. 2004; Giallongo et al. 2005; Treu
et al. 2005; Saracco et al. 2006; Bundy et al. 2006), while the sites of
active star formation shift to lower-mass systems at later epochs, a
pattern referred to as "downsizing" (Cowie et al. 1996). In order to
reconcile the observational evidence that stellar populations in large
spheroidal galaxies are old and essentially coeval (Ellis et al. 1997;
Holden et al. 2005) with the hierarchical merging scenario, the
possibility of mergers of evolved sub-units ("dry mergers") has been
introduced (van Dokkum et al. 2005; Faber et al. 2006; Naab et al. 2006).
This mechanism is, however, strongly disfavoured by studies of the
evolution of the stellar mass function (Bundy et al. 2006).

Key information, complementary to optical/IR data, has come from
sub-millimeter surveys (Hughes et al. 1998; Eales et al. 2000; Knudsen et
al. 2006), which have found a large population of luminous sources at
substantial redshifts (Chapman et al. 2005). However, the interpretation
of this class of objects is still controversial (e.g. Granato et al.
2004; Kaviani et al. 2003; Baugh et al. 2005).

The heart of the problem is the mass of these objects: a large fraction
of present-day massive galaxies already assembled at z ~ 2-3 would be
extremely challenging for the standard view of a merging-driven growth.
Measurements of clustering amplitudes are a unique tool to estimate halo
masses at high z, but complete samples comprising at least several
hundreds of sources are necessary. This is far more than have been
detected by sub-mm surveys, which have therefore only provided tentative
clustering estimates (Blain et al. 2005).

Here we report evidence of strong clustering for the optically very faint
(R > 25.5) sources included in the complete 24 μm sample obtained from
the Spitzer first cosmological survey (Fadda et al. 2006). Comparisons
with template spectral energy distributions and up-to-date models for
galaxy formation and evolution set these objects at z ~ 2. The clustering
properties and the counts of such sources are consistent with them being
very massive proto-spheroidal galaxies in the process of forming most of
their stars. Their comoving number density appears to be much higher than
expected from most semi-analytic models.

The layout of the paper is as follows. In § 2 we describe 
the sample selection. In § 3 we investigate the source photo- 
metric and spectroscopic properties. In § 4 we derive the two 
point angular correlation function, while in § 5 we present its 
implications for source properties, and in particular for their 
halo masses, in the light of the so-called Halo Occupation 
Model. Comparisons with model predictions are dealt with 
in § 6. Our main conclusions are summarized in § 7. 

Throughout this work we adopt a flat cosmology with a matter density
$\Omega_m = 0.3$ and a vacuum energy density $\Omega_\Lambda = 0.7$, a
present-day value of the Hubble parameter in units of 100 km/s/Mpc
h = 0.7, and rms density fluctuations within a sphere of $8 h^{-1}$ Mpc
radius $\sigma_8 = 0.8$ (Spergel et al. 2003).



2 THE SAMPLE SELECTION 
2.1 The Parent Catalogue 

Our analysis is based on the 24 μm data obtained during the first
cosmological survey performed by the Spitzer Space Telescope (First Look
Survey). Observations and data reduction are extensively described in
Fadda et al. (2006). Briefly, the survey consists of a shallow
observation of a 2.5° x 2° area centered at (17h18m, +59°30') (main
survey) and of a deeper observation of a smaller region of the sky
(verification survey) overlapping with the first one.

Observations were performed using MIPS (Multiband Imaging Photometer for
Spitzer, Rieke et al. 2004), whose spatial resolution at 24 μm is 5.9"
FWHM. Approximately 17000 sources have been extracted with
signal-to-noise ratio (SNR) greater than five down to ~ 0.2 mJy in the
main survey and to ~ 0.1 mJy in the verification survey. Astrometric
errors depend on the SNR, varying between 0.35" and 1.1" for sources
detected at the 20-5σ levels. The main survey is estimated to be > 90%
complete down to a limiting flux $F_{24\mu m} = 0.35$ mJy (Fadda et al.
2006).

Optical counterparts have been obtained by Fadda et al. (2006) for most
of the 24 μm sources by cross-correlating galaxies in the MIPS catalogue
with the R-band KPNO observations of Fadda et al. (2004) and - for
objects with R < 18 - with sources from the Sloan Digital Sky Survey
(Hogg et al., in preparation). These two optical datasets cover in a
roughly homogeneous way most of the area probed by the 24 μm main survey,
except for three corners. The typical limiting magnitude of the joint
SDSS+KPNO observations is R = 25.5, and ~ 82% of all the 24 μm sources in
the MIPS survey are reported to have an optical counterpart brighter than
this limit.

Figure 1. Distribution of the residuals
$\Delta x = {\rm RA}_{24\mu m} - {\rm RA}_{8\mu m}$ (solid line) and
$\Delta y = {\rm Dec}_{24\mu m} - {\rm Dec}_{8\mu m}$ (dashed line)
between 24 μm and 8 μm positions.

Despite ongoing efforts (Marleau et al. 2003; Choi et al. 2006; Yan et
al. 2005), there is still no homogeneous redshift information on the
sources making up the MIPS 24 μm catalogue, except for a very small area
overlapping with the GOODS/CDFS field (Caputi et al. 2006). However,
redshift estimates can be obtained from photometric data, taking
advantage of the Spitzer Infrared Array Camera (IRAC) survey, which
covers an extensive portion of the MIPS field (Lacy et al. 2005).

The main IRAC survey has covered an area of 3.8 square degrees in the
four channels centered at 3.6, 4.5, 5.8 and 8 μm, reaching a ~ 100%
completeness level at ~ 40, ~ 40, ~ 100 and ~ 100 μJy respectively
(Fig. 3 of Lacy et al. 2005). The positional accuracy goes from ~ 0.25"
for high signal-to-noise sources to 1" at the lowest flux levels.

Investigations of the Spectral Energy Distributions (SEDs) of prototype
sources such as M 82, Arp 220, and Mkn 231 (the latter with mid-IR
luminosity probably powered by an AGN) show that the tightest constraints
on the redshifts of very distant galaxies with intense star formation
come from the 8 μm IRAC channel, since such sources are expected to be
very weak at shorter wavelengths (see also Yan et al. 2005). Therefore,
in the following we will only consider data from the 8 μm channel.

2.2 Matching procedure 

We have looked for the 8 μm counterparts to MIPS sources over the 2.85
square degrees region 257.7° < RA(2000) < 261° and 58.6° < Dec < 60.3°,
for which both 8 μm and R-band observations are available. In this area
there are 7592 24 μm sources and 8,646 8 μm sources above the respective
completeness limits of 0.35 mJy and 0.1 mJy.

We identified as the counterpart to a MIPS source the IRAC source with a
positional separation less than a suitably chosen radius. As mentioned in
§ 2.1, the positional accuracy for both MIPS and IRAC sources with
SNR = 5 is ~ 1", so that we expect an rms positional difference due to
astrometric errors of ~ $\sqrt{2}$" ~ 1.4". Moreover, the IRAC astrometry
is based on 2MASS, while that of the 24 μm sources is tied to SDSS.
Although both systems are very accurate, a small systematic offset may be
present.

Tackling the problem from a more pragmatic point of view, we considered
the distribution of the residuals
$\Delta x = {\rm RA}_{24\mu m} - {\rm RA}_{8\mu m}$,
$\Delta y = {\rm Dec}_{24\mu m} - {\rm Dec}_{8\mu m}$ between the
positions of all 24 μm and 8 μm pairs with separations $|\Delta x|$ and
$|\Delta y|$ less than 5". The distribution of residuals shows a strong
concentration of points near $\Delta x \simeq -0.14$ and
$\Delta y \simeq 2.24 \times 10^{-4}$ arcsec, values which can be taken
as the mean positional offsets between the 24 μm and 8 μm reference
frames.

We have corrected for this effect and in Fig. 1 we plot the histogram of
the number of matches as a function of the $\Delta x$ (solid line) and
$\Delta y$ (dashed line) offsets. The distributions now correctly peak
near zero offset with an rms value of about 1.5", in agreement with the
above simple estimate. We have then chosen a 3" matching radius,
equivalent to about 2σ, which should therefore include ~ 95% of the true
identifications. The probability that an 8 μm source falls by chance
within the search radius from a 24 μm source (equal to the surface
density of 8 μm sources times the area within such radius) is
~ $6.6 \times 10^{-3}$. Increasing the matching radius increases the
number of interlopers more than that of true counterparts.
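
The chance-coincidence estimate above can be checked with a few lines of arithmetic. The sketch below assumes a uniform surface density of 8 μm sources over the 2.85 square degrees quoted in § 2.2; the function name is illustrative and not taken from the survey pipeline.

```python
import math

def chance_match_probability(n_sources, area_deg2, radius_arcsec):
    """Expected number of unrelated sources inside a circular search radius,
    assuming the sources are spread uniformly over the survey area."""
    density_per_arcsec2 = n_sources / (area_deg2 * 3600.0 ** 2)
    return density_per_arcsec2 * math.pi * radius_arcsec ** 2

# Values quoted in the text: 8,646 sources at 8 um over 2.85 deg^2, 3" radius.
p_chance = chance_match_probability(8646, 2.85, 3.0)
print(f"chance-match probability ~ {p_chance:.1e}")
```

Because the probability scales with the square of the radius, doubling the search radius would quadruple the expected number of interlopers.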

The above procedure yielded 3429 24 μm sources with
$F_{24\mu m} > 0.35$ mJy endowed with an 8 μm counterpart. Given the
completeness limit of the IRAC survey, we will then assume the remaining
MIPS objects to have an 8 μm counterpart fainter than 0.1 mJy.
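
The matching step described in this section can be sketched as a nearest-neighbour search within a fixed radius, after subtracting a systematic offset between the two reference frames. This is only an illustration under stated assumptions: a flat-sky approximation, RA offsets scaled by cos(Dec), a brute-force loop, and hypothetical function names and coordinates.

```python
import math

def offsets_arcsec(src, ref):
    """Return (dx, dy) = src - ref in arcsec for (ra, dec) pairs in degrees.
    Assumption: dx is scaled by cos(dec) so that it is a true angle."""
    dx = (src[0] - ref[0]) * math.cos(math.radians(ref[1])) * 3600.0
    dy = (src[1] - ref[1]) * 3600.0
    return dx, dy

def match(mips, irac, radius=3.0, shift=(0.0, 0.0)):
    """For each MIPS source return the index of the nearest IRAC source
    within `radius` arcsec (after removing `shift`), or None if there is none."""
    result = []
    for m in mips:
        best, best_sep = None, radius
        for j, s in enumerate(irac):
            dx, dy = offsets_arcsec(m, s)
            sep = math.hypot(dx - shift[0], dy - shift[1])
            if sep <= best_sep:
                best, best_sep = j, sep
        result.append(best)
    return result

# Hypothetical positions: the first source has a neighbour 1" away,
# the second has none within the search radius.
mips_positions = [(259.0, 59.5), (260.0, 60.0)]
irac_positions = [(259.0, 59.5 + 1.0 / 3600.0), (258.0, 59.0)]
matches = match(mips_positions, irac_positions)
print(matches)
```

Sources left with `None` correspond to the MIPS objects that the text assumes to be fainter than 0.1 mJy at 8 μm.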



3 PHOTOMETRIC AND SPECTROSCOPIC 
PROPERTIES 

The distribution of 8 μm to 24 μm vs 0.7 μm to 24 μm flux ratios for all
the 7592 MIPS sources with $F_{24\mu m} > 0.35$ mJy in the 2.85 deg$^2$
region covered by both the KPNO and the IRAC surveys is reported in
Fig. 2. R magnitudes have been converted to 0.7 μm fluxes using the
calibration in Fukugita, Shimasaku & Ichikawa (1995). For the sake of
clarity, objects with an 8 μm counterpart fainter than 0.1 mJy have been
assigned a single nominal $F_{8\mu m}$ value below this limit and are
responsible for the apparent gap observed in the lower part of the
$F_{8\mu m}/F_{24\mu m}$ axis, while sources without an optical
counterpart in the KPNO catalogue have all been given R = 25.5 and are
represented by the red or blue filled circles. Blue circles indicate
objects with $F_{8\mu m}/F_{24\mu m} > 0.1$, while the red ones are for
those with $F_{8\mu m}/F_{24\mu m} < 0.1$.

The solid (green) lines show, as a function of redshift, the colours
corresponding to the SEDs of Arp 220 (left-hand panel) and of Mkn 231
(right-hand panel). Arp 220 is a well studied local starburst galaxy,
featuring in its mid-IR spectrum signatures of heavy dust absorption
(Spoon et al. 2004), which is found to provide to a first approximation a
good template for the energy output of high-redshift galaxies undergoing
intense star formation (see e.g. Pope et al. 2006). Mkn 231 is a
prototype absorbed AGN dominating the mid-IR emission, hosted in a galaxy
with very intense star formation. This figure shows that, for both source
types, extreme 24 μm to R-band flux ratios (or R > 25.5) likely
correspond to sources at z ~ 1.6-3.

This conclusion is supported by the comparison of the distribution of the
$F_{1.4{\rm GHz}}/F_{24\mu m}$ vs $F_{0.7\mu m}/F_{24\mu m}$ colours for
the R > 25.5 sources from the complete MIPS sample with the track, as a
function of redshift, yielded by the Arp 220 SED (left-hand panel of
Fig. 3). Radio data come from the 20 cm radio survey performed by Condon
et al. (2003) on 82% of the 24 μm field down to a limiting flux of
0.1 mJy. Only 86 out of the 793 R > 25.5, $F_{24\mu m} > 0.35$ mJy
sources (11% of the sample) are detected. The lower dashed curve in the
right-hand panel of Fig. 3 details the redshift dependence of the
$F_{1.4{\rm GHz}}/F_{24\mu m}$ ratio for the Arp 220 SED, showing that
sources undergoing intense star formation and endowed with 24 μm fluxes
close to the MIPS detection limit present 1.4 GHz fluxes below the
0.1 mJy threshold of Condon's survey only if z < 1.2 or 1.7 < z < 3.
Combining this latter piece of information with the trend of the
$F_{1.4{\rm GHz}}/F_{24\mu m}$ vs $F_{0.7\mu m}/F_{24\mu m}$
distribution, we can conclude that most of the R > 25.5 MIPS sources are
consistent with being starburst galaxies in the redshift range
1.6 < z < 3. Although the above arguments do not exclude the possibility
that some of our sources are actually at z ~ 1, a really extreme
extinction would be required to make them fainter than R = 25.5, and
therefore such sources must be very rare. This is directly confirmed by
the spectroscopic observations summarized below, which did not find
R > 25.5 objects at z < 1.7.

The upper dashed curve in the right-hand panel of Fig. 3 shows that
galaxies with a SED similar to Arp 220, fluxes
$F_{24\mu m} \simeq 0.35$ mJy, and lying in the redshift range
1.6 < z < 3 have $F_{850\mu m} \lesssim 12$ mJy. As pointed out by Houck
et al. (2005), sources with a higher AGN contribution (in general
brighter at 24 μm than the above limit), which therefore present hotter
dust, exhibit $F_{850\mu m}/F_{24\mu m}$ ratios lower by a typical factor
of 3-5 than those of (sub)-mm selected sources
($F_{850\mu m}/F_{24\mu m} \simeq 5$; see e.g. Lutz et al. 2005), which
are generally high-z starburst galaxies (Ivison et al. 2004; Egami et al.
2004; Frayer et al. 2004; Charmandaris et al. 2004; Pope et al. 2006). A
small fraction (151 arcmin$^2$) of the area covered by the First Look
Survey has been observed by Sawicki & Webb (2005) with SCUBA on the JCMT.
These authors report the detection (S/N > 3.5) of ten sources with
$F_{850\mu m} > 10$ mJy. As expected on the basis of the above
discussion, none of them belongs to the complete sample of R > 25.5 MIPS
sources, although one of the detected objects is just below the 0.35 mJy
limit (J171736.9+593354, $F_{24\mu m} = 0.32$ mJy).

Figure 2. Distribution of the 8 μm to 24 μm vs the 0.7 μm to 24 μm flux
ratios for the 7592 MIPS sources with $F_{24\mu m} > 0.35$ mJy in the
2.85 deg$^2$ region covered by both the KPNO and the IRAC surveys. These
are compared with the computed colour-colour tracks as a function of
redshift (green solid lines) for the Arp 220 (left-hand panel) and the
Mkn 231 (right-hand panel) SEDs. For the sake of clarity, objects with an
8 μm counterpart fainter than 0.1 mJy have been assigned a single nominal
$F_{8\mu m}$ value below this limit and are responsible for the apparent
gap observed in the distribution along the y axis, while sources without
an optical counterpart in the KPNO catalogue have been attributed
R = 25.5 and are represented by the filled circles. Blue circles indicate
objects with $F_{8\mu m}/F_{24\mu m} > 0.1$, while the red ones are for
those with $F_{8\mu m}/F_{24\mu m} < 0.1$. Note that, although this is
not clear from the figure because many red points pile up in the same
spot, objects with $F_{8\mu m}/F_{24\mu m} < 0.1$ are much more numerous
than those with $F_{8\mu m}/F_{24\mu m} > 0.1$. Open squares represent
sources from the Pope et al. (2006) sample with
$F_{24\mu m} > 0.15$ mJy.

Thus, radio, sub-mm, mid-IR and optical photometric data converge in
indicating that the process of extracting optically faint sources from
samples selected at 24 μm singles out star-forming galaxies in the
redshift range $1.6 \lesssim z \lesssim 3$. As pointed out by Houck et
al. (2005), such a redshift range is determined by well established
selection effects. The requirement for the sources to be optically very
faint (R > 25.5) forces them to $z \gtrsim 1$, because obscuration is
higher in the rest-frame UV. On the other hand, the deep and broad 9.7 μm
silicate absorption feature, which is common in ultraluminous infrared
galaxies (Armus et al. 2004; Higdon et al. 2004; Spoon et al. 2004),
works against inclusion in 24 μm-selected samples of objects in the
redshift range $1 \lesssim z \lesssim 1.6$, while the strongest PAH
emission feature (set in the rest frame at λ = 7.7 μm) enters the 24 μm
filter at z ~ 2.1, and another relatively strong PAH line (λ = 6.2 μm)
appears at 24 μm for z ~ 2.9. Ultraluminous IR galaxies at still higher
redshifts are expected to become increasingly rare because of the dearth
of very massive galactic haloes.
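
The redshifts at which these spectral features reach the 24 μm band follow from z = λ_obs/λ_rest - 1. The short sketch below assumes a nominal band centre of exactly 24 μm and ignores the finite filter width, so it only reproduces the approximate values quoted in the text; the function name is illustrative.

```python
def redshift_entering_band(rest_um, observed_um=24.0):
    """Redshift at which a rest-frame wavelength is observed at observed_um."""
    return observed_um / rest_um - 1.0

# Rest-frame wavelengths (in micron) of the features discussed in the text.
features_um = {"PAH 6.2": 6.2, "PAH 7.7": 7.7, "silicate 9.7": 9.7}
feature_redshifts = {name: redshift_entering_band(rest)
                     for name, rest in features_um.items()}
for name, z in feature_redshifts.items():
    print(f"{name} um feature sits at 24 um for z = {z:.2f}")
```

The silicate absorption lands in the band at z close to 1.5, which is why sources around that redshift tend to drop out of a 24 μm flux-limited sample.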



The above conclusion is fully borne out by spectroscopic data, although
these are only available for a limited number of sources. Yan et al.
(2005) obtained low-resolution spectra with the Spitzer InfraRed
Spectrograph (IRS) for eight First Look Survey sources with 24 μm fluxes
brighter than 0.9 mJy. Further colour constraints which were applied
include: $\log_{10}(\nu F_\nu(24\mu m)/\nu F_\nu(8\mu m)) > 0.5$ and
$\log_{10}(\nu F_\nu(24\mu m)/\nu F_\nu(0.7\mu m)) > 1.0$. Three of
these sources (namely IRS2, IRS8, and IRS9) have R > 25.5. All three lie
in the redshift range $1.8 \lesssim z \lesssim 2.6$
($z_{\rm IRS2} = 2.34$; $z_{\rm IRS8} = 2.6$; $z_{\rm IRS9} = 1.8$). IRS2
and IRS9 show strong PAH emission lines and moderate silicate absorption
in their spectra, while IRS8 only presents strong silicate absorption.
IRS9 has also been observed with MIPS at 70 μm and with MAMBO at 1.2 mm,
and found to have fluxes of 42 mJy and 2.5 mJy respectively. The
estimated bolometric luminosities (Yan et al. 2005) are
$L_{\rm bol} = 1.83 \times 10^{13} L_\odot$ (IRS9),
$L_{\rm bol} = 4.3 \times 10^{13} L_\odot$ (IRS2), and
$L_{\rm bol} = 2 \times 10^{13} L_\odot$ (IRS8).

Houck et al. (2005) used the Spitzer Telescope to image at 24 μm a
9 deg$^2$ field within the NOAO Deep Wide-Field Survey region down to a
flux of 0.3 mJy. Thirty-one sources with $F_{24\mu m} > 0.75$ mJy and
R > 24.5 have further been observed with the IRS. Redshift determinations
were possible for 17 of them, including 13 sources with R > 25.5. Again,
the measured redshifts are all in the range 1.7 < z < 2.6, except
possibly for one object which appears to have $z \simeq 0.7$, but this is
the least secure determination due to a poor spectrum beyond 30 μm.

Eighteen optically faint (R > 23.9) sources from the Spitzer First Look
Survey, with $F_{24\mu m} > 1$ mJy and 20 cm detections to a limit of
115 μJy, have been observed with the IRS by Weedman et al. (2006). All
sources with R > 24 lie in the range 1.7 < z < 2.5.

Furthermore, Pope et al. (2006) have recently presented 24 μm
observations of 35 sub-mm selected sources with 850 μm fluxes > 2 mJy.
Nine of these sources have 24 μm fluxes brighter than 0.15 mJy, the limit
of the First Look Survey in the verification region, and
$z_{AB} > 24$ (7 have $z_{AB} > 26.2$). All of them have spectroscopic or
photometric redshifts in the range z ~ 1.7-2.7. Their colours, shown in
Fig. 2 by the magenta open squares and in the left-hand panel of Fig. 3
by filled squares, lie in those regions identified by the
$F_{24\mu m} > 0.35$ mJy and R > 25.5 MIPS objects, confirming that the
above selection singles out galaxies with properties similar to those
detected by (sub)-mm surveys.
Figure 3. Left-hand panel: distribution of the 1.4 GHz to 24 μm vs the
0.7 μm to 24 μm flux ratios for the 793 $F_{24\mu m} > 0.35$ mJy,
R > 25.5 MIPS sources present in the 3.97 deg$^2$ region considered in
this work. For the sake of clarity, objects fainter than the 0.1 mJy
detection limit at 1.4 GHz have been assigned a single nominal
$F_{1.4{\rm GHz}}$ value below this limit and produce the diagonal line
in the bottom right corner of the plot. Sources undetected by the KPNO
survey have been assigned R = 25.5. The filled squares indicate objects
from the Pope et al. (2006) sample with $F_{24\mu m} > 0.15$ mJy, while
the solid (green) line shows the track, as a function of redshift,
yielded by the Arp 220 SED. Right-hand panel: redshift dependence of the
$F_{850\mu m}/F_{24\mu m}$ (upper curve) and
$F_{1.4{\rm GHz}}/F_{24\mu m}$ (lower curve) flux density ratios for the
Arp 220 SED. The upper dotted line shows the ratio between the limiting
fluxes of our sample ($F_{24\mu m} = 0.35$ mJy) and of the dataset by
Sawicki & Webb (2005; $F_{850\mu m} = 10$ mJy), while the short-long
dashed one corresponds to decreasing $F_{850\mu m}$ to 2 mJy. The lower
dotted line is the ratio between the limiting 24 μm and 1.4 GHz fluxes
($F_{1.4{\rm GHz}} = 0.1$ mJy) for our sample.



On the basis of the above results obtained on both photometric and
spectroscopic grounds, we can then confidently assume that sources with
$F_{24\mu m} > 0.35$ mJy and R > 25.5 typically identify star-forming
galaxies in the redshift interval $1.6 \lesssim z \lesssim 3$. Such a
conclusion will be further strengthened in § 6, in the light of the most
up-to-date models for galaxy formation and evolution.



3.1 Starburst vs AGN components 

There are 510 sources with R > 25.5 and $F_{24\mu m} > 0.35$ mJy in the
2.85 deg$^2$ region where both 24 μm and 8 μm data are available (see
§ 2.2). Figure 2 illustrates that, for z ~ 1.5-3, "AGN-dominated" and
"starburst-dominated" SEDs have different $F_{8\mu m}/F_{24\mu m}$
ratios, the dividing line being set at
$F_{8\mu m}/F_{24\mu m} \simeq 0.1$ (see also Yan et al. 2005): starburst
galaxies (Arp 220-like SED) are generally below this value, while
AGN-powered sources (Mkn 231-like SED) are above it. Most (401) of the
510 sources within the overlapping IRAC-MIPS area present
$F_{8\mu m}/F_{24\mu m} < 0.1$, so that, to first order, they can be
classified as "starbursts".
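
The colour criterion above amounts to a one-line classifier. The sketch below uses the 0.1 dividing ratio from the text; the function name and the two example flux pairs are hypothetical and serve only to show the rule in use.

```python
def classify_by_colour(f8_mjy, f24_mjy, threshold=0.1):
    """Label a source from its 8 um to 24 um flux ratio (fluxes in mJy)."""
    ratio = f8_mjy / f24_mjy
    return "AGN-dominated" if ratio > threshold else "starburst-dominated"

# Counts quoted in the text for the overlapping IRAC-MIPS area.
n_total, n_starburst = 510, 401
starburst_fraction = n_starburst / n_total
agn_fraction = 1.0 - starburst_fraction

print(classify_by_colour(0.05, 0.60))   # hypothetical faint 8 um counterpart
print(classify_by_colour(0.30, 1.20))   # hypothetical bright 8 um counterpart
print(f"starbursts: {starburst_fraction:.0%}, AGN candidates: {agn_fraction:.0%}")
```

With the quoted counts, roughly four sources in five fall on the starburst side of the dividing line.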

Indeed, Fig. 4 shows that the shapes of the differential 24 μm counts for
the starburst and AGN sub-populations are very different. "AGNs" dominate
above ~ 0.8 mJy, consistent with the observational evidence that IR
spectra (obtained with the IRS: Infrared Spectrograph on Spitzer) for
optically faint but sufficiently mid-IR bright sources
($F_{24\mu m} \gtrsim 0.75$ mJy) predominantly present the typical shape
of obscured AGNs (Houck et al. 2005; Yan et al. 2005; Weedman et al.
2006). On the other hand, "starbursts" are found to prevail at fainter
fluxes and therefore constitute the dominant class in the present
MIPS-IRAC $F_{24\mu m} \geq 0.35$ mJy sample.



3.2 Definition of the Sample 

The arguments presented throughout this Section show that optically
faint, 24 μm-selected objects typically identify star-forming galaxies in
the redshift interval $1.6 \lesssim z \lesssim 2.7$.

There are 793 R > 25.5 MIPS galaxies with 24 μm fluxes brighter than
0.35 mJy in the area of approximately 3.97 square degrees
(257.25° < RA(2000) < 261.75°; 58.6° < Dec < 60.35°) covered by KPNO data
(cutting out the irregular regions close to the borders of the 24 μm
field). This region encloses the 2.85 square degrees where 8 μm data are
also available. The sources selected in the above fashion correspond to
7.4% of the total number of objects (10,693) brighter than
$F_{24\mu m} = 0.35$ mJy found in the same area and constitute the sample
which will be used in the following statistical analyses.

It is worth noting that, while the adopted 24 μm limit ensures
completeness as far as the mid-IR selection of the sample is concerned,
the optical R > 25.5 cut is somewhat arbitrary. In fact, the
spectroscopic studies summarized in § 3 find sources in the 1.6 < z < 2.7
redshift range having magnitudes brighter than our chosen value. On the
other hand, some of the sources with R ~ 24 observed by the above authors
turned out to have lower redshifts. A cut at R = 25.5 therefore ensures
that the overwhelming majority (if not all) of the selected objects lie
in the z ~ 2 range, while relaxing the optical magnitude limit may lead
to a contamination of the sample, while only adding a modest fraction of
sources (see also § 4 below). For example, if we lower the limit to
R = 24.5 we only add 139 objects to the adopted sample, corresponding to
a fractional increase of 17.5%.

Figure 4. Comparison of the differential counts of "starburst-dominated"
($F_{8\mu m}/F_{24\mu m} < 0.1$; open (red) circles) vs "AGN-dominated"
($F_{8\mu m}/F_{24\mu m} > 0.1$; filled (blue) circles) galaxies in the
$F_{24\mu m} > 0.35$ mJy sample of sources fainter than R = 25.5. Counts
refer to $\Delta F_{24\mu m} = 0.1$ mJy bins and are normalized to the
total number of objects (510) in the overlapping MIPS-IRAC region.

Such an incompleteness in the optical selection of the MIPS sample is not
expected to affect the clustering estimates, as long as the redshift
distribution of R > 25.5 sources does not greatly differ in shape from
that of all galaxies belonging to the same population and endowed with
mid-IR fluxes $F_{24\mu m} > 0.35$ mJy. The same does not hold when
considering the space density of such sources; in this case, the quantity
quoted in § 4 will have to be considered as a mere lower limit.



4 CLUSTERING PROPERTIES 

A mere visual inspection of the sky distribution of the 24 μm sources
with R > 25.5 which make up the sample presented in § 3.2 (filled circles
in Fig. 5) indicates that these objects are strongly clustered, much more
than the sources in the full $F_{24\mu m} > 0.35$ mJy First Look Survey
(small dots).

Figure 5. Sky distribution of the 793 optically faint (R > 25.5) sources
in the $F_{24\mu m} > 0.35$ mJy MIPS sample found within the area of 3.97
square degrees covered by KPNO data (red filled circles; see text for
details). The distribution of all the 24 μm sources brighter than the
same flux limit and enclosed in the same region of the sky is also shown
for comparison (small black dots).

Figure 6. Angular correlation function $w(\theta)$ for R > 25.5 sources
(filled circles) and for the whole $F_{24\mu m} > 0.35$ mJy MIPS sample
(open circles). The solid and the dashed lines show the best power-law
fits to the data.

The standard way to quantify the clustering properties of a particular
class of sources of unknown distance is by means of the angular two-point
correlation function $w(\theta)$, which measures the excess probability
of finding a pair in the two solid angle elements $d\Omega_1$ and
$d\Omega_2$ separated by an angle $\theta$. In practice, $w(\theta)$ is
obtained by comparing the actual source distribution with a catalogue of
randomly distributed objects subject to the same mask constraints as the
real data. We chose to use the estimator (Hamilton 1993)

$$ w(\theta) = 4 \times \frac{DD \cdot RR}{(DR)^2} - 1, \qquad (1) $$

in the range of scales $10^{-3} \lesssim \theta \lesssim 1$ degrees. DD,
RR and DR are the numbers of data-data, random-random and data-random
pairs separated by an angle $\theta$. The random catalogue was generated
with twenty times as many objects as the real data set, and the angular
distribution of its sources was modulated according to the MIPS coverage
map, so that the instrumental window function did not affect the measured
clustering.
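
A minimal sketch of the estimator in equation (1) is given below. It is a brute-force illustration rather than the procedure actually used for the survey: pair counts are unnormalised, DD and RR count each pair once while DR counts every data-random combination, which is one way to read the factor of 4 in the large-sample limit. Function names and the tiny test catalogues are hypothetical.

```python
import math
from itertools import combinations, product

def separation_deg(p, q):
    """Great-circle separation in degrees between two (ra, dec) points in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (*p, *q))
    h = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(h))))

def count_pairs(pairs, edges):
    """Histogram pair separations into bins with ascending edges (degrees)."""
    counts = [0] * (len(edges) - 1)
    for p, q in pairs:
        s = separation_deg(p, q)
        for i in range(len(counts)):
            if edges[i] <= s < edges[i + 1]:
                counts[i] += 1
                break
    return counts

def hamilton_w(data, randoms, edges):
    """w = 4 * DD * RR / DR^2 - 1 per bin; NaN where no data-random pairs exist."""
    dd = count_pairs(combinations(data, 2), edges)
    rr = count_pairs(combinations(randoms, 2), edges)
    dr = count_pairs(product(data, randoms), edges)
    return [4.0 * d * r / x ** 2 - 1.0 if x > 0 else float("nan")
            for d, r, x in zip(dd, rr, dr)]

# Tiny hypothetical catalogues in a single wide bin, just to exercise the code.
data_points = [(10.0, 0.0), (10.0, 0.2)]
random_points = [(10.0, 0.1), (10.0, 0.3), (10.0, 0.5), (10.0, 0.7)]
w_bins = hamilton_w(data_points, random_points, [0.0, 1.0])
print(w_bins)
```

For a real catalogue the random points would have to follow the survey coverage map, and a tree-based pair counter would replace the quadratic loops.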

The resulting angular correlation function $w(\theta)$ for the obscured
(R > 25.5), $F_{24\mu m} > 0.35$ mJy sources is shown by the filled
circles in Fig. 6. The errors have been computed as

$$ \delta w(\theta) = \sqrt{\frac{1 + w(\theta)}{DD}}. \qquad (2) $$

Since the distributions are clustered, this (Poisson) estimate for the
errors only provides a lower limit to the real uncertainties. However, it
can be shown that over the considered range of angular scales this
estimate is close to those obtained from bootstrap resampling (e.g.
Willumsen, Freudling & Da Costa 1997). On the other hand, the above
estimate does not take into account the uncertainties on the sample
selection, which we are unable to quantify but which may be substantial.
We tentatively allow for them by doubling the Poisson errors in the
following analysis and in the Figures. The above analysis was repeated
using the Landy & Szalay (1993) estimator, and we found virtually
identical results.
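
The error treatment described above can be written compactly. The sketch assumes the Poisson form sqrt((1 + w)/DD) for the error on each bin, with the doubling applied as a multiplicative factor; the function name and the example numbers are hypothetical.

```python
import math

def correlation_error(w, dd, inflate=2.0):
    """Poisson error sqrt((1 + w) / DD) on a correlation-function bin,
    multiplied by `inflate` to allow for unquantified selection uncertainties."""
    if dd <= 0:
        return float("inf")   # no data-data pairs: the bin is unconstrained
    return inflate * math.sqrt((1.0 + w) / dd)

# Hypothetical bin with w = 3 measured from 100 data-data pairs.
print(correlation_error(3.0, 100))               # doubled Poisson error
print(correlation_error(3.0, 100, inflate=1.0))  # plain Poisson error
```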

We have also investigated the possibility that the clustering properties
of high-z starburst galaxies are contaminated by obscured AGNs (see
§ 3.1), which may cluster differently. However, removing candidate AGNs,
i.e. objects with $F_{8\mu m}/F_{24\mu m} > 0.1$ (or alternatively
$F_{24\mu m} > 0.8$ mJy), which make up ~ 20 per cent of the total
sample, leaves the angular correlation function essentially unaffected,
indicating that AGN-powered sources have clustering properties similar to
those of starbursts.

On the other hand, if we somewhat relax the optical magnitude limit, e.g.
from R > 25.5 to R > 24.5, the estimated $w(\theta)$ becomes
significantly noisier, even though the fraction of added sources is only
17.5% of the original sample (similar to that of "AGNs"), suggesting that
a substantial portion of optically brighter, $F_{24\mu m} > 0.35$ mJy
sources are at lower redshifts. Including still optically brighter
sources, the angular correlation function rapidly decreases, approaching
that obtained for the whole 24 μm-selected sample (open circles in
Fig. 6; in this case the associated errors simply correspond to 1σ).

If we adopt the usual power-law form

$$ w(\theta) = A\,\theta^{(1-\gamma)}, \qquad (3) $$

the parameters A and $\gamma$ can be estimated via a least-squares fit to
the data. However, given the large errors on w, we choose to fix $\gamma$
to the standard value $\gamma = 1.8$. We then obtain
$A = (7 \pm 2) \times 10^{-3}$ (solid line in Fig. 6); the point in the
top left-hand corner has not been considered in our analysis because it
corresponds to an angular scale close to the resolution of the instrument
and therefore, despite the accuracy of the deblending technique applied
to produce the original MIPS catalogue, it may be affected by source
confusion.

The amplitude A is about three times higher than that derived by Fang et
al. (2004) for a sample of IRAC galaxies selected at 8 μm
($A \simeq 2.34 \times 10^{-3}$), and about eight times higher than that
obtained for the whole $F_{24\mu m} > 0.35$ mJy MIPS dataset
($A = (9 \pm 2) \times 10^{-4}$, dashed line in Fig. 6).
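
With the slope fixed, the fit of equation (3) reduces to a one-parameter weighted least-squares problem with a closed-form solution. The sketch below is an illustration under that assumption, with θ in degrees; the function name is hypothetical and the data points are synthetic, generated from A = 7 x 10^-3 purely to show that the fit returns the input amplitude.

```python
import math

def fit_amplitude(theta_deg, w, sigma, gamma=1.8):
    """Weighted least-squares amplitude A of w = A * theta**(1 - gamma),
    with gamma held fixed. Returns (A, one-sigma error on A)."""
    model = [t ** (1.0 - gamma) for t in theta_deg]
    num = sum(wi * mi / si ** 2 for wi, mi, si in zip(w, model, sigma))
    den = sum(mi ** 2 / si ** 2 for mi, si in zip(model, sigma))
    return num / den, 1.0 / math.sqrt(den)

# Synthetic, noise-free points (not the measured data) with 50% errors.
theta = [0.01, 0.03, 0.1, 0.3]
w_obs = [7e-3 * t ** (1.0 - 1.8) for t in theta]
errors = [0.5 * wi for wi in w_obs]
amplitude, amplitude_err = fit_amplitude(theta, w_obs, errors)
print(f"A = {amplitude:.2e} +/- {amplitude_err:.2e}")

# Amplitude ratios quoted in the text.
print(f"vs IRAC 8 um sample: {7e-3 / 2.34e-3:.1f}")
print(f"vs full MIPS sample: {7e-3 / 9e-4:.1f}")
```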



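The angular correlation function is related to the spatial one,
$\xi(r,z)$, by the relativistic Limber equation (Peebles 1980):

$$ w(\theta) = 2\,\frac{\int_0^\infty \int_0^\infty F^{-2}(x)\, x^4\,
\Phi^2(x)\, \xi(r,z)\, dx\, du}{\left[\int_0^\infty F^{-1}(x)\, x^2\,
\Phi(x)\, dx\right]^2}, \qquad (4) $$

where x is the comoving coordinate, F(x) gives the correction for
curvature, and the selection function $\Phi(x)$ satisfies the relation

$$ \mathcal{N} = \int_0^\infty \Phi(x)\, F^{-1}(x)\, x^2\, dx =
\frac{1}{\Omega_s} \int_0^\infty N(z)\, dz, \qquad (5) $$

in which $\mathcal{N}$ is the mean surface density, $\Omega_s$ is the
solid angle covered by the survey, and N(z) is the number of sources
within the shell (z, z + dz).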

If we make the simple assumption, consistent with the
photometric and spectroscopic information summarized in
§ 3, that $N(z)$ is constant in the range $1.6 \le z \le 2.7$
and adopt a spatial correlation function of the form
$\xi(r,z) = (r/r_0)^{-\gamma}$, independent of redshift (in
comoving coordinates) in the considered interval, we obtain,
for the adopted cosmology,
$r_0(z = 1.6-2.7) = 14.0^{+2.1}_{-2.4}$ Mpc. The clustering
radius increases (decreases) if we broaden (narrow) the
redshift range. If we instead adopt the redshift distribution
predicted by the model of Granato et al. (2004; see § 6) we
get $r_0(z = 1.6-2.7) = 15.2^{+2.3}_{-2.6}$ Mpc. Note that the
assumption of a redshift-independent comoving clustering
radius is borne out by observational estimates for optical
quasars (Porciani et al. 2004; Croom et al. 2005) which -
according to the Granato et al. (2004) model - correspond
to a later evolutionary phase of the AGNs hosted by 24 μm
sources.
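For a fixed redshift distribution and a fixed slope, the Limber projection of a power-law $\xi(r)$ gives an angular amplitude that scales as $r_0^{\gamma}$, so the uncertainty on $A$ maps onto $r_0$ through $r_0 \propto A^{1/\gamma}$. The short Python sketch below applies that scaling to the fitted amplitude, $A = 7 \pm 2$ in units of $10^{-3}$, and to the central value $r_0 = 14.0$ Mpc. The helper name `r0_interval` is hypothetical, and the calculation ignores any uncertainty on $N(z)$.

```python
def r0_interval(r0_best, a_best, a_err, gamma=1.8):
    """Map a symmetric error on the amplitude A onto the clustering length.

    Uses r0 proportional to A**(1/gamma), valid when gamma and N(z) are fixed.
    Returns (lower r0, upper r0) for A - a_err and A + a_err.
    """
    def scale(a):
        return r0_best * (a / a_best) ** (1.0 / gamma)
    return scale(a_best - a_err), scale(a_best + a_err)

r0_lo, r0_hi = r0_interval(14.0, 7.0, 2.0)
print(f"r0 = 14.0 (+{r0_hi - 14.0:.1f} / -{14.0 - r0_lo:.1f}) Mpc")
```

The resulting interval is asymmetric because the mapping from $A$ to $r_0$ is non-linear.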

The above value of $r_0$ is in good agreement with the
estimates obtained in the case of ultra-luminous infrared
galaxies over $1.5 < z < 3$ (Farrah et al. 2006a,b), and
also matches that found by Magliocchetti & Maddox (1999)
in the analysis of the clustering properties of galaxies in
the Hubble Deep Field North selected in the same redshift
range. Massive star-forming galaxies at $z \sim 2$ thus appear
to be amongst the most strongly clustered sources in the
Universe. Locally, their clustering properties find a
counterpart in those exhibited by radio sources (see e.g.
Magliocchetti et al. 2004) and are second only to those of
rich clusters of galaxies (e.g. Guzzo et al. 2000). The
implications of this result will be investigated in the next
Sections.

Under the assumption of a uniform redshift distribution
in the range $1.6 < z < 2.7$ and for the adopted cosmology,
the mean comoving space density of sources with $R > 25.5$
and $F_{24\,\mu{\rm m}} > 0.35$ mJy is:

$$ n_{\rm obs}(1.6 < z < 2.7) \simeq 1.5 \times 10^{-5}\ {\rm Mpc}^{-3}. \qquad (6) $$
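The order of magnitude of this density can be checked by dividing the number of sources by the comoving volume of the survey between the two redshift limits. The Python sketch below does so under stated assumptions: a flat ΛCDM cosmology with $\Omega_m = 0.3$ and $h = 0.7$ (assumed here for illustration), about 800 sources (the round number quoted in the abstract) and the 3.97 deg² KPNO area quoted in the caption of Fig. 9. The function names are hypothetical.

```python
import math

C_KM_S = 299792.458  # speed of light [km/s]

def comoving_distance(z, h=0.7, omega_m=0.3, steps=1000):
    """Line-of-sight comoving distance [Mpc] in flat LCDM (Simpson's rule)."""
    def inv_e(zz):
        return 1.0 / math.sqrt(omega_m * (1.0 + zz) ** 3 + 1.0 - omega_m)
    dz = z / steps  # steps must be even for Simpson's rule
    total = inv_e(0.0) + inv_e(z)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * inv_e(i * dz)
    return C_KM_S / (100.0 * h) * total * dz / 3.0

def comoving_density(n_sources, area_deg2, z_min, z_max):
    """Mean comoving number density [Mpc^-3] for a uniform shell z_min..z_max."""
    solid_angle = area_deg2 * (math.pi / 180.0) ** 2   # [sr]
    d_near = comoving_distance(z_min)
    d_far = comoving_distance(z_max)
    volume = solid_angle * (d_far ** 3 - d_near ** 3) / 3.0  # flat geometry
    return n_sources / volume

n_obs = comoving_density(800, 3.97, 1.6, 2.7)
print(f"n_obs ~ {n_obs:.2e} Mpc^-3")
```

The estimate moves if the assumed cosmology, source count or redshift limits change, so it is a consistency check rather than a re-derivation of eq. (6).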



5 THE HALO OCCUPATION NUMBER (HON) 

A closer look at Fig. 6 shows that a simple power law
provides a good fit to the measured $w(\theta)$ only over the
range $0.007° \lesssim \theta \lesssim 0.5°$. Even though
masked by large error bars, a hint of a steepening can in fact
be discerned on the smallest angular scales and may also be
present at the largest angles probed by our analysis. The
small-scale steepening is intimately related to the way the
sources under examination occupy their dark matter haloes, an
issue which will be dealt with throughout this Section via the
so-called Halo Occupation Scenario, while the steepening on
large angular scales is most likely due to the high redshift
of these sources (0.5° in the adopted cosmology and for a
redshift $z = 2$ corresponds to a scale of ~ 45 Mpc, above
which the real-space correlation function rapidly approaches
zero).
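The conversion between angle and comoving scale quoted here is the transverse comoving distance multiplied by the angle in radians. A minimal Python sketch, assuming a flat ΛCDM cosmology with $\Omega_m = 0.3$ and $h = 0.7$ (values assumed for illustration) and a hypothetical helper name:

```python
import math

def transverse_comoving_scale(theta_deg, z, h=0.7, omega_m=0.3, steps=2000):
    """Comoving length [Mpc] subtended by theta_deg at redshift z (flat LCDM)."""
    c_over_h0 = 299792.458 / (100.0 * h)   # Hubble distance [Mpc]
    dz = z / steps
    # Midpoint rule for the integral of dz / E(z) from 0 to z.
    integral = sum(
        dz / math.sqrt(omega_m * (1.0 + (i + 0.5) * dz) ** 3 + 1.0 - omega_m)
        for i in range(steps)
    )
    return math.radians(theta_deg) * c_over_h0 * integral

scale_mpc = transverse_comoving_scale(0.5, 2.0)
print(f"0.5 deg at z = 2 spans about {scale_mpc:.0f} comoving Mpc")
```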



5.1 Setting up the formalism 

The halo occupation function is defined as the probability
distribution of the number of galaxies brighter than some
luminosity threshold hosted by a virialized halo of given mass.
Within this framework, it is possible to show (see e.g.
Peacock & Smith 2000 and Scoccimarro et al. 2001) that the
distribution of galaxies within their dark matter haloes
determines the galaxy-galaxy clustering on small scales. Since
the distribution of sources within their haloes in general
depends on the efficiency of galaxy formation, clustering
measurements can provide important insights into the physics
of the objects producing the signal. This approach has
been successfully applied in the past few years to a number
of cases, from local galaxies (Magliocchetti & Porciani
2003; Zehavi et al. 2004) to higher redshift sources such as
COMBO 17 and Lyman Break galaxies (e.g. Phleps et al.
2005; Hamana et al. 2003; Ouchi et al. 2005) and quasars
(Porciani, Magliocchetti & Norberg 2004).

Our analysis follows the approach adopted by Magliocchetti
& Porciani (2003), which is in turn based on the work
by Scherrer & Bertschinger (1991) and Scoccimarro et al.
(2001). The basic quantity here is the halo occupation
distribution function $P_N(M)$, which gives the probability of
finding $N$ galaxies within a single halo as a function of the
halo mass $M$. Given the halo mass function $n(M)$ (number
density of dark matter haloes per unit comoving volume and
$\log_{10}(M)$), the mean value of the halo occupation
distribution, $N(M) \equiv \langle N \rangle(M) = \sum_N N\,P_N(M)$
(which, from now on, we will call the halo occupation number),
completely determines the mean comoving number density of
galaxies in the desired redshift range:

$$ n_{\rm gal} = \int n(M)\,N(M)\,dM. \qquad (7) $$



Relations analogous to eq. (7) and involving higher-order
moments of $P_N(M)$ can be used to derive the clustering
properties in the framework of the halo model. For instance,
the 2-point correlation function can be written as the sum
of two terms

$$ \xi(r) = \xi^{1h}(r) + \xi^{2h}(r). \qquad (8) $$

The function $\xi^{1h}$, which accounts for pairs of galaxies
residing within the same halo, depends on the second factorial
moment of the halo occupation distribution,
$\sigma(M) = \langle N(N-1) \rangle(M)$, and on the spatial
distribution of galaxies within their host haloes,
$\rho(\mathbf{x}|M)$, normalized in such a way that
$\int_0^{r_{\rm vir}} \rho(\mathbf{y}|M)\,d^3y = 1$, where
$r_{\rm vir}$ is the virial radius, which is assumed to mark
the outer boundary of the halo:

$$ \xi^{1h}(r) = \int \frac{n(M)\,\sigma(M)}{n_{\rm gal}^2}\,dM \int \rho(\mathbf{x}|M)\,\rho(\mathbf{x}+\mathbf{r}|M)\,d^3x. \qquad (9) $$

On the other hand, the term $\xi^{2h}$, which takes into
account the contribution to the correlation function coming
from galaxies in different haloes, depends on both $N(M)$ and
$\rho(\mathbf{x}|M)$ as follows

$$ \xi^{2h}(r) = \int \frac{n(M_1)\,N(M_1)}{n_{\rm gal}}\,dM_1 \int \frac{n(M_2)\,N(M_2)}{n_{\rm gal}}\,dM_2 \int \rho(\mathbf{x}_1|M_1)\,\rho(\mathbf{x}_2|M_2)\,\xi(r_{12}|M_1,M_2)\,d^3r_1\,d^3r_2, \qquad (10) $$

where $\xi(r|M_1,M_2)$ is the cross-correlation function of
haloes of mass $M_1$ and $M_2$, $\mathbf{x}_i$ denotes the
distance from the centre of each halo, and $r_{12}$ is the
separation between the haloes.

For separations smaller than the virial radius of the
typical galaxy host halo, the 1-halo term dominates the
correlation function, while the 2-halo contribution is the
most important one on larger scales. In this latter regime
$\xi(r|M_1,M_2)$ is proportional to the mass autocorrelation
function, i.e. $\xi(r|M_1,M_2) \simeq b(M_1)\,b(M_2)\,\xi_{\rm dm}(r)$,
where $b(M)$ is the linear bias factor of haloes of mass $M$.
Note that all the different quantities introduced in this
Section depend on the redshift $z$ even though we have not made
it explicit in the equations.

In order to use the halo model to study the galaxy clus- 
tering, one has to specify a number of functions describing 
the statistical properties of the population of dark matter 
haloes. In general, these have either been obtained analyti- 
cally and calibrated against N-body simulations, or directly 
extracted from numerical experiments. 

For the mass function and the linear bias factor of dark
matter haloes we adopt here the model by Sheth & Tormen
(1999), while we write the two-point correlation function of
dark-matter haloes as (see e.g. Porciani & Giavalisco 2002;
Magliocchetti & Porciani 2003)

$$ \xi(r|M_1,M_2) = \begin{cases} \xi_{\rm dm}(r)\,b(M_1)\,b(M_2) & \text{if } r \ge r_1 + r_2 \\ -1 & \text{otherwise,} \end{cases} \qquad (11) $$

where the mass autocorrelation function, $\xi_{\rm dm}(r)$, is
computed with the method of Peacock & Dodds (1996), $r_1$ and
$r_2$ are the radii of the two haloes, and the above
expression takes into account the spatial exclusion between
haloes (i.e. two haloes cannot occupy the same volume). We
will also assume that the distribution of galaxies within
their haloes traces that of the dark matter, and we adopt for
$\rho(\mathbf{x}|M)$ a Navarro, Frenk & White (1997; NFW)
profile with a concentration parameter obtained from eqs. (9)
and (13) of Bullock et al. (2001). In fact, Magliocchetti &
Porciani (2003) showed that NFW profiles are well suited
to describe the correlation function of local 2dF galaxies.
We note however that the uncertainties associated with our
estimates of $w(\theta)$ do not allow us to discriminate
between different forms for $\rho(\mathbf{x}|M)$ as long as
they are sensible ones (e.g. profiles of the form
$\rho \propto r^{-\beta}$ with $2 \le \beta \le 3$, $\beta = 2$
corresponding to the singular isothermal sphere case).

The final key ingredient needed to describe the clustering
properties of a class of galaxies is their halo occupation
function $P_N(M)$. In the ideal case $P_N(M)$ is entirely
specified by the knowledge of all its moments, which in
principle can be observationally determined by studying galaxy
clustering at any order. Unfortunately this is not feasible in
practice, as measures of the higher moments of the galaxy
distribution get extremely noisy for $n > 4$ even for local
2-dimensional catalogues (see e.g. Gaztanaga 1995 for an
analysis of the APM survey). On the other hand, the present
work relies on measurements of the two-point correlation
function, which only depends on the first two moments of
the halo occupation function, $N(M)$ and $\sigma(M)$.

Figure 7. Best fit to the observed angular correlation function
$w(\theta)$ of our sources, in the HON framework for
$\alpha = 0.2$, $M_{\rm min} = 10^{13.4}\,M_\odot$,
$\log_{10}(N_0) = -0.3$. The point on the smallest angular
scale has not been taken into account, because of uncertainties
in source deblending.

Figure 8. Average number of MIR-bright
($F_{24\,\mu{\rm m}} > 0.35$ mJy), optically faint ($R > 25.5$)
galaxies per dark matter halo of specified mass $M$ in the
redshift range $z \simeq [1.6 - 2.7]$. The solid line
corresponds to the best-fit values of eq. (15), while the
dashed lines correspond to cases with $\Delta\chi^2 = 1$ (see
text).

Following Porciani et al. (2004; see also Magliocchetti
& Porciani 2003 and Hatton et al. 2003) we parameterize
these quantities as:

$$ N(M) = \begin{cases} N_0\,(M/M_{\rm min})^{\alpha} & \text{if } M \ge M_{\rm min} \\ 0 & \text{if } M < M_{\rm min} \end{cases} \qquad (12) $$

and

$$ \sigma(M) = \beta(M)^2\,N(M)^2, \qquad (13) $$

where $\beta(M) = 0$, $\log(M/M_{\rm min})/\log(M_0/M_{\rm min})$, $1$,
respectively for $N(M) = 0$, $N(M) < 1$ and $N(M) \ge 1$. The
operational definition of $M_0$ is such that $N(M_0) = 1$ (see
e.g. Porciani et al. 2004), while $M_{\rm min}$ is the minimum
mass of a halo able to host a source of the kind under
consideration. More and more massive haloes are expected to
host more and more galaxies, justifying the assumption of a
power-law shape for the halo occupation number. As already
pointed out in Porciani et al. (2004), eq. (12) is more general
than the commonly used $N(M) = (M/M_0)^{\alpha}$ which, for
$\alpha = 0$, automatically implies $N(M) = 1$ at any
$M > M_0$.

As for the variance $\sigma(M)$, we note that the high-mass
value for $\beta(M)$ simply reflects Poisson statistics, while
the functional form at intermediate masses (chosen to fit
the results from semi-analytical models - see e.g. Sheth
& Diaferio 2001; Berlind & Weinberg 2002 - and
hydrodynamical simulations - Berlind et al. 2003) describes the
(strongly) sub-Poissonian regime. We assume that the various
quantities describing the HON do not vary in the considered
redshift range. Although a simplification, this choice
is partially justified by the results obtained for other
extragalactic sources sampling the same redshift range as our
dataset (e.g. quasars, see Porciani et al. 2004), which indeed
show that the relevant parameters associated with $N(M)$ stay
constant with look-back time.
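The parameterization of the first two moments can be sketched in a few lines of Python. The block below is illustrative only: the function names are hypothetical, the parameter values are those quoted in the caption of Fig. 7 ($\alpha = 0.2$, $\log_{10}(M_{\rm min}/M_\odot) = 13.4$, $\log_{10}(N_0) = -0.3$), and the intermediate-mass branch of $\beta(M)$ is taken as $\log(M/M_{\rm min})/\log(M_0/M_{\rm min})$, the form that rises from 0 at $M_{\rm min}$ to 1 at $M_0$ (an assumption about the intended expression).

```python
import math

def hon_mean(mass, n0, m_min, alpha):
    """Mean occupation number N(M): power law above m_min, zero below."""
    return n0 * (mass / m_min) ** alpha if mass >= m_min else 0.0

def unit_occupation_mass(n0, m_min, alpha):
    """Mass M0 defined by N(M0) = 1; requires alpha > 0."""
    return m_min * n0 ** (-1.0 / alpha)

def beta(mass, n0, m_min, alpha):
    """Factor relating <N(N-1)> to N(M)^2: 0, a log ramp, or 1 (Poisson)."""
    n_mean = hon_mean(mass, n0, m_min, alpha)
    if n_mean == 0.0:
        return 0.0
    if n_mean >= 1.0:
        return 1.0
    m0 = unit_occupation_mass(n0, m_min, alpha)
    return math.log(mass / m_min) / math.log(m0 / m_min)

# Parameter values quoted in the Fig. 7 caption (masses in solar masses).
N0, M_MIN, ALPHA = 10 ** -0.3, 10 ** 13.4, 0.2

m0 = unit_occupation_mass(N0, M_MIN, ALPHA)
print(f"log10(M0 / Msun) = {math.log10(m0):.1f}")
for log_m in (13.0, 13.4, 14.0, 15.0):
    m = 10 ** log_m
    print(log_m, round(hon_mean(m, N0, M_MIN, ALPHA), 2),
          round(beta(m, N0, M_MIN, ALPHA), 2))
```

With these values the mean occupation stays below one object per halo until the mass approaches the cluster scale, which is the sub-Poissonian regime described in the text.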



5.2 Results 

In the application of the HON formalism to the present
sample we allowed the parameters of eq. (12) to vary within
the following ranges:

$$ 0 \le \alpha \le 2, \qquad 10^{[\ldots]}\,M_\odot \le M_{\rm min} \le 10^{[\ldots]}\,M_\odot, \qquad -2 \le \log_{10}(N_0) \le 1. \qquad (14) $$



Values for these parameters have been determined through a
minimum $\chi^2$ technique by fitting the observed $w(\theta)$
(except for the smallest angular scale point, cf. § 4). The
angular correlation function was computed from eq. (4), with
$\xi(r)$ given by eq. (8) and the redshift distribution yielded
by the Granato et al. (2004) model (see § 6).

We find that the best fit to the $w(\theta)$ alone is obtained
for:

$$ \alpha = 0.2^{+0.7}_{-0.2}, \qquad \log_{10}(M_{\rm min}/M_\odot) = 13.5^{+[\ldots]}_{-[\ldots]}, \qquad \log_{10}(N_0) = -0.3^{+0.2}_{-1.7}, \qquad (15) $$

where the quoted errors correspond to $\Delta\chi^2 = 1$. We
note however that the sampled correlation function bins are not
completely independent, so that the error estimate is only
indicative.

The parameters are correlated with each other. In
particular, higher values for $M_{\rm min}$ correspond to lower
values of $\alpha$ (see Fig. 8). Furthermore, as $N_0$ is always
found to be less than 1 and the index $\alpha$ is rather flat,
on average there is less than one such star-forming galaxy per
dark matter halo (sub-Poissonian regime) except at the
highest, cluster-like mass scales.

It may be noted that the data on $w(\theta)$ can only
effectively constrain $M_{\rm min}$, while allowing for
relatively broad ranges in the case of $N_0$ and $\alpha$. This
can be easily understood since, for typical halo masses
$\gg M_*$, the halo bias function $b(M)$ is a steep function of
$M$, so that even small variations of $M_{\rm min}$ result in
large variations of the predicted $w(\theta)$ on
intermediate-to-large angular scales. On the other hand, $N_0$
can only be constrained by data on small angular scales
(one-halo regime) via the theoretical variance $\sigma(M)$.
But the one-halo regime is represented by just one data point
and furthermore, as long as the regime is sub-Poissonian, the
predicted one-halo correlation function is only mildly
dependent on this quantity.

Additional constraints on the three parameters
characterizing the HON [eq. (12)] can be obtained from the
estimated comoving number density of our sources. In fact,
any HON model must simultaneously be able to reproduce
both the first and second moment of the galaxy distribution,
i.e. both the clustering properties and the observed number
density of sources in a specified sample. As discussed in § 4,
the estimate in eq. (6) is expected to provide a lower limit
to the number density of high-redshift star-forming galaxies
with $F_{24\,\mu{\rm m}} > 0.35$ mJy, although with a
substantial uncertainty, mostly related to our poor knowledge
of the redshift distribution. If we then require $n_{\rm gal}$
in eq. (7) to be $\ge 6 \times 10^{-6}$ Mpc$^{-3}$ (i.e. allow
for an uncertainty of a factor of 2.5) the permitted ranges for
the parameters narrow down, becoming:



$$ \alpha = 0.2^{+0.4}_{-[\ldots]}, \qquad \log_{10}(M_{\rm min}/M_\odot) = 13.4^{+0.1}_{-0.3}, \qquad \log_{10}(N_0) = -0.3^{+0.2}_{-[\ldots]}, \qquad (16) $$

which are the best-fit values for the HON [eq. (12)] satisfying
both the clustering and the number density requirements. We
note that, while as expected the range for $M_{\rm min}$ is
basically unaffected, the constraint on $n_{\rm gal}$ greatly
shrinks the allowed region for $N_0$ by cutting all those
values which would have produced too few sources. The above
best-fit values do not change significantly if instead of the
$N(z)$ deriving from the Granato et al. (2004) model we use the
flat redshift distribution introduced in § 3; in this case we
get $\alpha \simeq 0.3$, $\log_{10}(M_{\rm min}/M_\odot) \simeq 13.3$
and $\log_{10}(N_0) \simeq -0.4$.

The theoretical angular correlation function corresponding
to the above best-fit HON parameters is compared
to the data in Fig. 7. The model correctly describes both
the overall amplitude of $w(\theta)$ and the rise on angular
scales $\lesssim 10^{-2}$ deg determined by the one-halo
regime. The corresponding Halo Occupation Number of
high-redshift star-forming galaxies is presented in Fig. 8,
which shows that the sources under examination are always
associated with very massive structures, identifiable with
groups-to-clusters of galaxies. As can also be noticed, such
galaxies are reasonably common in those massive structures,
with an average of ~ 0.5-1 objects per group, where the upper
value is found at the highest masses probed by our analysis.
The implications of these results will be investigated in
the next Section, when discussing the nature of optically
faint objects as selected at 24 μm.
Figure 9. Number of 24 μm sources fainter than $R = 25.5$
(filled squares) and of all 24 μm MIPS sources counted, in
$\Delta F_{24\,\mu{\rm m}} = 0.1$ mJy bins, in the 3.97 deg$^2$
area covered by KPNO data. The lower panel represents the ratio
between the above quantities as a function of the 24 μm flux,
while the dashed line in the top panel shows the predictions
by Silva et al. (2004; 2005) for high-z proto-spheroidal
galaxies.

Figure 10. Redshift distribution of dusty proto-spheroidals
with fluxes $F_{24\,\mu{\rm m}} > 0.35$ mJy, normalized to
unity, obtained following Silva et al. (2004; 2005).



6 NATURE OF THE SOURCES 

Before the Spitzer survey data became available, Silva et al.
(2004; 2005) worked out detailed predictions for the counts
and the redshift distributions of IR sources. In particular
they predicted that for $F_{24\,\mu{\rm m}} \ge 0.35$ mJy, MIPS
surveys would have comprised a small, but significant (8-10%)
fraction of objects in the redshift range ~ 1.5-2.6 (with a
tail extending up to $z > 3$; cf. their Figure 27). At the flux
limit ($F_{24\,\mu{\rm m}} > 83$ μJy) of the MIPS survey of the
Chandra Deep Field South (Papovich et al. 2004), Silva et al.
(2004; 2005) predicted a surface density of proto-spheroids
at $z > 1.5$ of ~ 1 arcmin$^{-2}$, i.e. amounting to ~ 22% of
the observed total surface density. This prediction was at
variance with those of reference phenomenological models
which, for $F_{24\,\mu{\rm m}} > 83$ μJy, yielded (see Figure 2
of Perez-Gonzalez et al. 2005 and Figure 6 of Caputi et al.
2006) either very few (Chary et al. 2004) or almost 50%
(Lagache et al. 2004) sources at $z > 1.5$. The redshift
distribution by Perez-Gonzalez et al. (2005) for
$F_{24\,\mu{\rm m}} > 83$ μJy, based primarily on photometric
redshifts for starburst templates, has about 24% of sources at
$z > 1.5$. This result was recently confirmed, within the
errors, by the work of Caputi et al. (2006) who found ~ 28% of
sources to lie in that $z$ range.

The basic difference between the work of Silva et al.
(2004; 2005) and that of the other quoted models is that,
while Chary et al. (2004) and Lagache et al. (2004) adopt
a purely empirical/phenomenological approach to describe
the high redshift population of galaxies selected at 24 μm
(by e.g. evolving the local luminosity function in luminosity
and/or density), Silva et al. (2004; 2005) consider a more
physically grounded picture. In fact, according to Silva et
al. (2004; 2005) the $z > 1.5$ population corresponds to
massive proto-spheroidal galaxies in the process of forming
most of their stars in a gigantic starburst, whose evolution is
described by the physical model of Granato et al. (2004). This
population is not represented in the local far-IR luminosity
function, since local massive spheroids are essentially
dust-free. We refer the interested reader to the Granato et al.
(2004) paper for a full account of the physical justification
and a detailed description of the model. Here we provide a
short summary of its main features, focusing on the aspects
which are relevant to the present discussion.

6.1 Overview of the Granato et al. (2004) model 

The model adopts the standard hierarchical clustering 
framework for the formation of dark matter haloes. It fo- 
cuses on the redshift range z>1.5, where a good approxi- 
mation of the halo formation rates is provided by the positive 
term in the cosmic time derivative of the cosmological mass 
function (e.g., Haehnelt & Rees 1993; Sasaki 1994). The sim- 
ulations by Wechsler et al. (2002), and Zhao et al. (2003a,b) 
show that the growth of a halo occurs in two different phases: 
a first regime of fast accretion in which the potential well is 
built up by the sudden mergers of many clumps of compa- 
rable mass; and a second regime of slow accretion in which 
mass is added in the outskirts of the halo, without affecting 
the central region where the galactic structure resides. This 
means that, even at high redshift, once created the haloes 
harboring a massive elliptical galaxy are rarely destroyed 
and get incorporated within groups and clusters of galaxies. 

The physics governing the evolution of the baryons is
much more complex. The main features of the model can be
summarized as follows (see Granato et al. 2004, Cirasuolo et
al. 2005, Lapi et al. 2006). During or soon after the formation
of the host dark matter halo, the baryons falling into the 
newly created potential well are shock-heated to the virial 
temperature. The hot gas is (moderately) clumpy and cools 
quickly especially in the denser central regions, triggering
a strong burst of star formation. The radiation drag due
to starlight acts on the gas clouds, reducing their angular 
momentum. As a consequence, a fraction of the cool gas can 
fall into a reservoir around the central super-massive black 
hole, and eventually accretes onto it by viscous dissipation, 
powering the nuclear activity. The energy fed back to the gas 
by supernova explosions and black hole activity regulates the 
ongoing star formation and the black hole growth. 

Initially, the cooling is rapid and the star formation is 
very high; thus the radiation drag is efficient in accumulating 
mass into the reservoir. The black hole starts growing from
an initial seed with mass ~ $10^2\,M_\odot$ already in place at
the galactic center. Since there is plenty of material in this
phase, the accretion is Eddington (or moderately
super-Eddington) limited (e.g. Small & Blandford 1992;
Blandford 2004). This
regime goes on until the energy feedback from the black hole 
is strong enough to unbind the gas from the potential well, a 
condition occurring around the peak of the accretion curve. 
Subsequently, the star formation rate drops substantially, 
the radiation drag becomes inefficient, the storage of mat- 
ter in the reservoir and the accretion onto the black hole 
decrease by a large factor. The drop is very pronounced for
massive haloes, $M_{\rm vir} \gtrsim 10^{12}\,M_\odot$, while
for smaller masses a
smoother declining phase can continue for several Gyrs, and
the black hole and stellar masses can further increase by a 
substantial factor. 

Before the peak, radiation is highly obscured by the
surrounding dust. In fact, those proto-galaxies are extremely
faint in the rest-frame UV/optical/near-IR and are more
easily selected at far-IR to mm wavelengths. Nuclear emission
is heavily obscured too, but since absorption significantly
decreases with increasing X-ray energy of the photons, this
may be detected in hard X-ray bands (Alexander et al. 2005;
Borys et al. 2005; Granato et al. 2006). The mid-IR region
may also be particularly well suited to detect such emission,
because the dust temperature in the nuclear torus, hotter than
that of the interstellar medium, makes this component more
prominent, and the optical depth is relatively low.

6.2 Model versus observations 

The 24 μm counts of candidate high-z proto-spheroidal
galaxies predicted by Silva et al. (2004; 2005) are compared,
in Fig. 9, with those of MIPS sources fainter than $R = 25.5$.
The total 24 μm counts for the MIPS sample are also shown
for comparison.

The fraction of optically faint sources decreases from
~ 10% at the lowest 24 μm fluxes to less than 5% at the
brightest ones. The decrement of this fraction with
increasing flux is slowed down or halted above ~ 0.8 mJy, when
the 'AGN' contribution takes over.

The dashed line in Fig. 9 represents the predictions by
Silva et al. (2004; 2005), based on the Granato et al. (2004)
model, for the counts at 24 μm of dusty proto-spheroidals
undergoing intense star formation. It must be noted that
Silva et al. adopted a highly simplified description of the
very complex source spectra in the relevant rest-frame
frequency range, where strong emission and absorption features
present a broad distribution of equivalent widths. Also,
although the model explicitly predicts a significant nuclear
activity with an exponential growth of the central black hole
mass during the active star-forming phase, nuclear emission
was neglected in the calculations by Silva et al. (2004; 2005).
Thus, an accurate match of the observed counts cannot be
expected. Still, the observed counts of optically faint sources
are remarkably close to the predictions, suggesting that
objects in the present sample are high-z proto-spheroidal
galaxies. The redshift distribution of those sources brighter
than 0.35 mJy at 24 μm (Fig. 10), computed by following
Silva et al. (2004; 2005), matches the redshift range
estimated in the previous sections on the basis of photometric
and spectroscopic evidence.

Following again Silva et al. (2004; 2005) we obtain that
~ 5% of proto-spheroidal galaxies with
$F_{24\,\mu{\rm m}} > 0.35$ mJy have 850 μm flux > 10 mJy,
consistent with the fact that none of the four sources in our
sample lying in the 151 arcmin$^2$ area surveyed with SCUBA by
Sawicki & Webb (2005) to the above 850 μm flux limit was
detected. The model predicts most (80-90%) of proto-spheroidal
galaxies with $F_{24\,\mu{\rm m}} > 0.35$ mJy to have
$F_{850\,\mu{\rm m}} > 1$ mJy, but these are only a small
fraction (2-4%) of 850 μm sources at this flux limit.

We note that such a spread in the predicted 850 μm
fluxes reflects the spread in star formation rates (SFRs).
The ~ 5% of proto-spheroidal galaxies selected at 24 μm
with $F_{850\,\mu{\rm m}} \ge 10$ mJy are predicted to have
SFR > 1000 $M_\odot$ yr$^{-1}$, while about 20-30% of the
sample sources should have SFR > 500 $M_\odot$ yr$^{-1}$, and
about 90% are characterized by SFR > 100 $M_\odot$ yr$^{-1}$.
Most of the sources have SFR ~ few × 100 $M_\odot$ yr$^{-1}$.

The median halo mass, estimated from the model,
is $\log_{10}(M_{\rm vir}/M_\odot) \simeq 12.7$. The
corresponding peak SFR ranges from 550 to ~ 800
$M_\odot$ yr$^{-1}$ for virialization redshifts ranging from 3
to 4, i.e. it is substantially higher than the typical SFRs of
the 24 μm sources. This means that, according to the Granato et
al. (2004) model, the 24 μm selection preferentially identifies
sources in the phase when the effect of feedback has begun to
damp the SFR, at the same time decreasing the optical depth,
which, at earlier times, is very high even at rest-frame mid-IR
wavelengths. In this phase the active nucleus is approaching
its maximum luminosity and can therefore show up at relatively
bright flux densities for a relatively short time, while the
starburst luminosity is far higher than that of the AGN over
most of its lifetime.

Also, the estimated halo mass is substantially lower
than that accounting for the clustering properties (cf. § 5).
According to the model, this difference is to be expected
since the starburst phase of these haloes has a lifetime which
is shorter than the Hubble time by a factor of ~ 5 at
$z \sim 2$. This means that starburst galaxies work as beacons
signalling the presence of larger haloes, typically hosting ~ 5
galactic haloes with $\log_{10}(M_{\rm vir}/M_\odot) \simeq 12.7$,
out of which only one is seen at 24 μm.



7 CONCLUSIONS 

We have found that optically very faint ($R > 25.5$) galaxies
selected at 24 μm ($F_{24\,\mu{\rm m}} > 0.35$ mJy) are very
strongly clustered. Their two-point angular correlation
function has an amplitude which is about 8 times higher than
that found for the full $F_{24\,\mu{\rm m}} > 0.35$ mJy sample,
and about 3 times higher than the one estimated by Fang et al.
(2004) for a sample selected at 8 μm. Radio, sub-mm, mid-IR,
and optical photometric data converge in indicating that these
sources are very luminous star-forming galaxies at redshifts
$1.6 \lesssim z \lesssim 3$. Spectroscopic redshifts for
sources with similar photometric properties fall in the range
$1.6 < z < 2.7$. If sources have a relatively flat distribution
in the above redshift interval, and we adopt the conventional
power-law representation for the spatial two-point correlation
function, $\xi(r,z) = (r/r_0)^{-1.8}$, we obtain, for the
adopted cosmology, a comoving clustering radius of
$r_0(z = 1.6-2.7) = 14.0^{+2.1}_{-2.4}$ Mpc, implying that
these sources are amongst the most strongly clustered objects
in the universe. This result is in good agreement with the
estimates for ultra-luminous infrared galaxies over the
redshift range $1.5 < z < 3$ obtained by Farrah et al.
(2006a,b) by selecting sources with
$F_{24\,\mu{\rm m}} > 0.4$ mJy, $R > 22$ and bumps in either
the 4.5 or the 5.8 μm IRAC channel:
$r_0 = 14.4 \pm 1.99\,h^{-1}$ Mpc for the $2 < z < 3$ sample
(bump in the 5.8 μm channel) and
$r_0 = 9.40 \pm 2.24\,h^{-1}$ Mpc for the $1.5 < z < 2.0$
sample (bump in the 4.5 μm channel).

The halo model provides a good fit to the observed
angular correlation function for a minimum halo mass of
~ $10^{13.4}\,M_\odot$. The number density of haloes above this
mass is fully consistent with that of our sources, therefore
providing an independent test of the results derived from the
$w(\theta)$ alone.

At rest-frame wavelengths ~ 8 μm (corresponding to
the selection wavelength for the typical redshifts $z \sim 2$
of our sources) both the direct starlight and the interstellar
dust emission are relatively low, so that nuclear activity
can more easily show up. In fact, indications that optically
faint Spitzer sources with $F_{24\,\mu{\rm m}} \gtrsim 1$ mJy
are AGN dominated have been reported (Houck et al. 2005;
Weedman et al. 2006). Comparing the redshift dependencies of
the 8 μm/24 μm flux ratios corresponding to the SEDs of
well-known galaxies such as Arp 220 (starburst galaxy) and
Mkn 231 (obscured AGN) we find that, over the redshift range of
interest here, starburst galaxies are expected to have
$F_{8\,\mu{\rm m}}/F_{24\,\mu{\rm m}} < 0.1$ and AGN-dominated
sources $F_{8\,\mu{\rm m}}/F_{24\,\mu{\rm m}} > 0.1$. Adopting
this criterion, we find that in our sample the latter sources
dominate for $F_{24\,\mu{\rm m}} > 0.8$ mJy, while starburst
galaxies prevail at fainter fluxes, comprising ~ 80% of the
sample. No significant difference in the clustering properties
of the two sub-populations was detected.

Our optically faint sources that, as argued above, are
most likely at $1.6 < z < 2.7$, comprise ~ 7.4% of the
complete $F_{24\,\mu{\rm m}} \ge 0.35$ mJy sample. This
fraction is remarkably close to the prediction by Silva et al.
(2004; 2005). These authors pointed out that the physically
grounded evolutionary model for massive spheroidal galaxies by
Granato et al. (2004) implies that the active star-forming
phase of these objects had to show up in 24 μm MIPS surveys and
estimated that they constitute ~ 8-10% of all sources brighter
than $F_{24\,\mu{\rm m}} = 0.35$ mJy and cover the redshift
range 1.5-2.6 (with a tail extending up to $z > 3$; cf. Figure
27 of Silva et al. 2004). At the flux limit (83 μJy) of the
24 μm MIPS survey of the Chandra Deep Field South, the model
predicts a surface density of ~ 1.0 arcmin$^{-2}$ for sources
at $z > 1.5$ (~ 22% of the total surface density of sources
brighter than that flux limit), in good agreement with the
observational determinations by Perez-Gonzalez et al. (2005)
and Caputi et al. (2006), yielding fractions of 24% and 28%
(corresponding to surface densities of ~ 1.1-1.3
arcmin$^{-2}$).

The photometric (radio, sub-mm, mid-IR, optical)
properties of our sources are all consistent with their
interpretation in terms of massive star-forming
proto-spheroidal galaxies. The Granato et al. (2004) model
features a tight connection between star-formation activity and
growth of the active nucleus hosted by the galaxy. It thus
naturally accounts for the indications of a dominant AGN
contribution for $F_{24\,\mu{\rm m}} > 0.8$ mJy.

According to this model, most sources have high, but
not extreme, star-formation rates (SFRs), SFR ~ few ×
100 $M_\odot$ yr$^{-1}$, well below the peak SFR reached during
their evolution. This is because during the most active
star-forming phases the optical depth of the interstellar
medium is very high even at rest-frame wavelengths of ~ 8 μm.
Thus, the 24 μm selection preferentially singles out sources in
the phase when the effect of feedback has begun to damp the
SFR, at the same time substantially decreasing the optical
depth. In this phase, the active nucleus is approaching its
maximum luminosity and shows up, for a short time, at the
brightest flux levels, as is indeed observed.

The median galactic halo mass, estimated from the
model, is $\log_{10}(M_{\rm vir}/M_\odot) \simeq 12.7$,
significantly lower than that accounting for the clustering
properties. This difference is due to the fact that the
lifetime of the starburst phase of these haloes is shorter (by
a factor ~ 5) than the Hubble time, so that starburst galaxies
work as beacons signalling the presence of larger haloes,
typically hosting ~ 5 galactic haloes with
$\log_{10}(M_{\rm vir}/M_\odot) \simeq 12.7$, out of which only
one is bright enough at 24 μm to meet the observational
selection.

Once we account for this lifetime effect, the comoving
density of proto-spheroidal galaxies at $z \sim 2$ matches that
of $\log_{10}(M_{\rm vir}/M_\odot) \simeq 12.7$ haloes at the
same epoch, and is substantially higher than that predicted by
most semi-analytic models for galaxy formation.